Abstract

Based on direct integrals, a framework is developed that allows one to integrate a parametrised family of reproducing kernels with respect to some measure on the parameter space. By pointwise integration, one again obtains a reproducing kernel whose corresponding Hilbert space is given as the image of the direct integral of the individual Hilbert spaces under the summation operator. This generalises the well-known results for finite sums of reproducing kernels; however, many more special cases are subsumed under this approach: so-called Mercer kernels obtained through series expansions; kernels generated by integral transforms; mixtures of positive definite functions; and in particular scale-mixtures of radial basis functions. This opens new vistas into known results, e.g. generalising the Kramer sampling theorem; it also offers interesting connections between measurements and integral transforms, e.g. allowing one to apply the representer theorem in certain inverse problems, or bounding the pointwise error in the image domain when observing the pre-image under an integral transform.



Keywords: reproducing kernel; integral transform; radial basis function; scale-mixture; Kramer sampling; representer theorem.

1 Overview

Reproducing kernel Hilbert spaces (r.k.h.s.s) play an important role in many branches of mathematics, the corresponding reproducing kernels (r.k.s) also being called positive definite functions (p.d.f.s), radial basis functions (r.b.f.s) or autocorrelation functions (a.c.f.s) in special cases, see e.g. (Wendland 2005). Often, r.k.s are constructed via integral transforms, cf. Section 4.1 and the references therein; by orthogonal expansions, so-called Mercer kernels, cf. Section 4.1.3; or by scale-mixtures of r.b.f.s, cf. Section 4.2.2 and the references quoted there. In fact, all of these r.k.s share a common characteristic, as we will show below: they are obtained by integrating a parametrised family of r.k.s over the parameters; note that this also covers summation of kernels via integration with respect to a discrete measure.

Our aim is therefore to present an abstract framework, based on direct integrals, for the integration of r.k.s; this will be developed in Section 3 where Theorem 3.1 states conditions under which this is possible, allowing the r.k. to be calculated through pointwise integration, while also characterising its r.k.h.s. as the image of the direct integral of the individual Hilbert spaces under the summation operator. The aforementioned special cases of integral transforms, Mercer kernels, mixtures of p.d.f.s and r.b.f.s will be shown to be direct consequences of this theorem in Section 4, noting connections to some classical results of Bochner (1933) and Schoenberg (1938). This framework also allows one to view sampling equations, in particular Kramer sampling, from a slightly different and more general perspective in Section 5. Finally, we will point out some interesting relationships between the r.k.h.s. obtained by integration and the pre-images in the $L_2$-space in Section 6, allowing the use of the representer theorem to solve an "inverse problem" over the $L_2$-space in Proposition 6.2, or bounding the pointwise error in the image domain by Proposition 6.4.

However, before we start to develop the abstract framework, we recall the basic definitions, and the simplest, classical case of the direct sum of two r.k.h.s. in Section 2. This will form the starting point which our abstract framework generalises.
Goettingen, Goldschmidtstrasse 7, 37077 Goettingen, Germany; hotz@math.uni-goettingen.de.

2 Direct sums of reproducing kernel Hilbert spaces



Recall the notion of a reproducing kernel Hilbert space (r.k.h.s.): let $\mathcal H$ be a Hilbert space of functions from some set $X$ into $\mathbb C$ (everything that follows applies equally if the scalar field is $\mathbb R$) with scalar product $\langle\cdot,\cdot\rangle$; then the function $K$ defined on $X\times X$ is called a reproducing kernel (r.k.) if for every $y\in X$ we have $K(\cdot,y)\in\mathcal H$ as a function of the first argument, and $K$ possesses the reproducing property, i.e. for all $f\in\mathcal H$ and $y\in X$ we have

$$ f(y) = \langle f, K(\cdot,y)\rangle. \tag{1} $$

Then, $\mathcal H$ is called a r.k.h.s. over $X$ with r.k. $K$.

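In his seminal article, Aronszajn (1950) proved that for two r.k.h.s. $\mathcal H_1$ and $\mathcal H_2$ over the same set $X$ with scalar products $\langle\cdot,\cdot\rangle_1$ and $\langle\cdot,\cdot\rangle_2$, together with r.k. $K_1$ and $K_2$, respectively, their sum

$$ \mathcal H = \mathcal H_1 + \mathcal H_2 = \{ f = f_1 + f_2 : f_1\in\mathcal H_1,\ f_2\in\mathcal H_2 \} \tag{2} $$

is again a r.k.h.s., with r.k.

$$ K = K_1 + K_2 \tag{3} $$

and norm $\|\cdot\|$ given by

$$ \|f\|^2 = \inf_{f = f_1 + f_2,\ f_1\in\mathcal H_1,\ f_2\in\mathcal H_2} \big( \|f_1\|_1^2 + \|f_2\|_2^2 \big), \tag{4} $$

see (Aronszajn 1950, §1.6). He proved this by considering first the direct sum

$$ \tilde{\mathcal H} = \mathcal H_1 \oplus \mathcal H_2 \tag{5} $$

with scalar product given by

$$ \langle (f_1,f_2),(g_1,g_2)\rangle_{\tilde{\mathcal H}} = \langle f_1,g_1\rangle_1 + \langle f_2,g_2\rangle_2. \tag{6} $$

Then, $\mathcal H = \mathcal R(S)$, the range of the summation operator

$$ S : \tilde{\mathcal H}\to\mathcal H,\quad (f,g)\mapsto f+g, \tag{7} $$

whose null space

$$ \mathcal N(S) = \{ (f,-f) : f\in\mathcal H_1\cap\mathcal H_2 \} \tag{8} $$

is closed, whence $S$ is a bijection between $\mathcal N(S)^\perp$ and $\mathcal R(S)=\mathcal H$, and one can push forward the induced Hilbert space structure from $\tilde{\mathcal H}$ to $\mathcal H$. The corresponding r.k. $K$ is then easily seen to be given by $K(\cdot,y) = S\tilde K(\cdot,y)$ where $\tilde K(\cdot,y) = (K_1(\cdot,y),K_2(\cdot,y))$. All one uses for this proof is the reproducing property of the respective kernels. It is clear that one can inductively obtain the r.k.h.s. corresponding to the sum of finitely many kernels.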
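The role of the infimum in (4) is easiest to see when the null space of $S$ is non-trivial. The following minimal sketch assumes the hypothetical toy case $\mathcal H_1 = \mathcal H_2 = \operatorname{span}\{\varphi\}$ for a real function $\varphi$ with $\|\varphi\|_1 = \|\varphi\|_2 = 1$, so that $K_1 = K_2 = \varphi(x)\varphi(y)$ and $K = 2\varphi(x)\varphi(y)$. For $f = c\varphi$, formula (4) asks for the infimum of $a^2 + b^2$ over all splittings $a + b = c$, which should agree with the norm implied by the reproducing property of $K$. The function names and the grid search are illustrative choices, not part of the original argument.

```python
def inf_split_norm_sq(c, steps=20001):
    """Approximate inf of a^2 + b^2 over a + b = c by a grid search over a."""
    lo, hi = -abs(c) - 1.0, abs(c) + 1.0
    best = float("inf")
    for i in range(steps):
        a = lo + (hi - lo) * i / (steps - 1)
        best = min(best, a * a + (c - a) ** 2)
    return best


def norm_sq_from_kernel(c, phi_y=0.7):
    """Squared norm of c*phi in the r.k.h.s. with kernel K = 2*phi(x)*phi(y).

    K(., y) = 2*phi(y)*phi has squared norm K(y, y) = 2*phi(y)^2,
    hence ||phi||^2 = K(y, y) / (2*phi(y))^2 = 1/2 for any y with phi(y) != 0.
    phi_y is a hypothetical non-zero value phi(y).
    """
    k_yy = 2.0 * phi_y * phi_y
    phi_norm_sq = k_yy / (2.0 * phi_y) ** 2
    return c * c * phi_norm_sq
```

Both routes give $c^2/2$, attained at the symmetric splitting $a = b = c/2$; any other splitting differs from it by an element of $\mathcal N(S)$ and has larger norm in the direct sum.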



3 Abstract framework 

We now want to generalise the summation of kernels in the previous section to the integration of kernels. Towards this end, we first of all need an analogue of the direct sum in (5). For this, assume the index set $\Omega$ features a $\sigma$-algebra $\mathcal A$, and $\mu$ is a measure on that measurable space $(\Omega,\mathcal A)$. Moreover, for every $\omega\in\Omega$ let there be given a r.k.h.s. $\mathcal H_\omega$ over $X$ with scalar product $\langle\cdot,\cdot\rangle_\omega$, induced norm $\|\cdot\|_\omega$ and r.k. $K_\omega$.

These r.k.h.s.s need to be related in a measurable way: we assume that there is a partition of $\Omega$ into measurable sets $\Omega_n\in\mathcal A$, $n\in\mathbb N^\infty = \mathbb N\cup\{\infty\}$, and isometries $E_\omega : \mathcal H_\omega\to\mathbb C^n$ for $\omega\in\Omega_n$, where we denote $\ell_2(\mathbb C)$ by $\mathbb C^\infty$ for uniformity of exposition. We then call a cross-section $f = (f_\omega)_{\omega\in\Omega}$ of elements $f_\omega\in\mathcal H_\omega$, $\omega\in\Omega$, measurable if for all $n\in\mathbb N^\infty$ the maps $\omega\mapsto E_\omega f_\omega : \Omega_n\to\mathbb C^n$ are measurable, whence $\omega\mapsto\|f_\omega\|_\omega : \Omega\to[0,\infty)$ is measurable, too.

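A natural generalisation of the direct sum is then given by the direct integral

$$ \tilde{\mathcal H} = \int_\Omega^\oplus \mathcal H_\omega\,d\mu(\omega) = \Big\{ f : f \text{ is a measurable cross-section and } \int_\Omega \|f_\omega\|_\omega^2\,d\mu(\omega) < \infty \Big\}, \tag{9} $$

which is again a Hilbert space with scalar product

$$ \langle f,g\rangle_{\tilde{\mathcal H}} = \int_\Omega \langle f_\omega,g_\omega\rangle_\omega\,d\mu(\omega) \tag{10} $$

and norm $\|f\|_{\tilde{\mathcal H}} = \langle f,f\rangle_{\tilde{\mathcal H}}^{1/2}$ if we identify $f\in\tilde{\mathcal H}$ and $g\in\tilde{\mathcal H}$ in case $\int_\Omega \|f_\omega - g_\omega\|_\omega^2\,d\mu(\omega) = 0$. A worthwhile introduction to this topic can be found in (Nielsen 1980).

Note that the direct integral reduces to the direct sum if $\Omega$ is finite, $\mathcal A$ its power set, and $\mu$ is the counting measure on $\Omega$.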

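Next, we need a summation operator $S$ as in (7), mapping $\tilde{\mathcal H}$ into the space $\mathcal F(X)$ of functions over $X$ endowed with the topology of pointwise convergence. We define $S$ pointwise by setting

$$ (Sf)(x) = \int_\Omega f_\omega(x)\,d\mu(\omega) = \int_\Omega \langle f_\omega, K_\omega(\cdot,x)\rangle_\omega\,d\mu(\omega) = \sum_{n\in\mathbb N^\infty}\int_{\Omega_n} \langle E_\omega f_\omega, E_\omega K_\omega(\cdot,x)\rangle_{\mathbb C^n}\,d\mu(\omega) \tag{11} $$

for every cross-section $f\in\tilde{\mathcal H}$ and every point $x\in X$. The integrands in the last expression are clearly measurable if $\tilde K(\cdot,x) = (K_\omega(\cdot,x))_{\omega\in\Omega}$ is a measurable cross-section, whence $(Sf)(x)$ is well-defined if

$$ \int_\Omega |\langle f_\omega, K_\omega(\cdot,x)\rangle_\omega|\,d\mu(\omega) < \infty. \tag{12} $$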

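By applying Cauchy-Schwarz twice, we can estimate

$$ \begin{aligned} \int_\Omega |\langle f_\omega, K_\omega(\cdot,x)\rangle_\omega|\,d\mu(\omega) &\le \int_\Omega \|f_\omega\|_\omega\,\|K_\omega(\cdot,x)\|_\omega\,d\mu(\omega) \\ &\le \Big( \int_\Omega \|f_\omega\|_\omega^2\,d\mu(\omega) \Big)^{1/2} \Big( \int_\Omega \|K_\omega(\cdot,x)\|_\omega^2\,d\mu(\omega) \Big)^{1/2} \\ &= \|f\|_{\tilde{\mathcal H}}\,\|\tilde K(\cdot,x)\|_{\tilde{\mathcal H}} \\ &< \infty \end{aligned} \tag{13} $$

by assumption. The cross-section $\tilde K(\cdot,x)$ is in $\tilde{\mathcal H}$ for $x\in X$ if it is measurable and its norm in $\tilde{\mathcal H}$ is finite; the latter is given by

$$ \|\tilde K(\cdot,x)\|_{\tilde{\mathcal H}}^2 = \int_\Omega \|K_\omega(\cdot,x)\|_\omega^2\,d\mu(\omega) = \int_\Omega K_\omega(x,x)\,d\mu(\omega). \tag{14} $$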

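Furthermore, the operator $S : \tilde{\mathcal H}\to\mathcal F(X)$ is continuous; indeed, if a sequence $f^{(n)}\in\tilde{\mathcal H}$ converges to some $f\in\tilde{\mathcal H}$ then

$$ \Big| \int_\Omega f_\omega(x) - f^{(n)}_\omega(x)\,d\mu(\omega) \Big| \le \int_\Omega |\langle f_\omega - f^{(n)}_\omega, K_\omega(\cdot,x)\rangle_\omega|\,d\mu(\omega) \le \|f - f^{(n)}\|_{\tilde{\mathcal H}}\,\|\tilde K(\cdot,x)\|_{\tilde{\mathcal H}} \to 0 \tag{15} $$

for $n\to\infty$. Hence, the null space of $S$,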

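$$ \mathcal N(S) = \Big\{ f\in\tilde{\mathcal H} : \int_\Omega f_\omega\,d\mu(\omega) = 0 \Big\}, \tag{16} $$

is closed and $S$ is a vector space isomorphism between $\mathcal N(S)^\perp$ and $\mathcal H = \mathcal R(S)$. We thus endow $\mathcal H$ with the Hilbert space structure turning $S$ into an isometry on $\mathcal N(S)^\perp$, i.e. we set for $f,g\in\mathcal N(S)^\perp$ the scalar product of $Sf, Sg\in\mathcal H$ to be

$$ \langle Sf,Sg\rangle = \langle f,g\rangle_{\tilde{\mathcal H}}. \tag{17} $$

The obvious candidate for the r.k. on $\mathcal H$ is then given by

$$ K(\cdot,x) = S\tilde K(\cdot,x) \tag{18} $$

for every $x\in X$. Note that $\tilde K(\cdot,x)\in\mathcal N(S)^\perp$ for every $x\in X$ since for every $f\in\tilde{\mathcal H}$

$$ \langle f,\tilde K(\cdot,x)\rangle_{\tilde{\mathcal H}} = \int_\Omega \langle f_\omega, K_\omega(\cdot,x)\rangle_\omega\,d\mu(\omega) = \int_\Omega f_\omega(x)\,d\mu(\omega) = (Sf)(x), \tag{19} $$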
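which is $0$ for $f\in\mathcal N(S)$. Furthermore, for any decomposition $f = f^{(1)} + f^{(2)}\in\tilde{\mathcal H}$ with $f^{(1)}\in\mathcal N(S)$, $f^{(2)}\in\mathcal N(S)^\perp$ and any $x\in X$

$$ \langle Sf, K(\cdot,x)\rangle = \langle f^{(2)},\tilde K(\cdot,x)\rangle_{\tilde{\mathcal H}} = \langle f,\tilde K(\cdot,x)\rangle_{\tilde{\mathcal H}} = (Sf)(x), \tag{20} $$

which shows the reproducing property of $K$ for $\mathcal H$.

Finally, the norm $\|\cdot\|$ on $\mathcal H$ is given by

$$ \|Sf\|^2 = \langle Sf,Sf\rangle = \langle f^{(2)},f^{(2)}\rangle_{\tilde{\mathcal H}} = \|f^{(2)}\|_{\tilde{\mathcal H}}^2 = \inf_{g\in\tilde{\mathcal H}\,:\,Sg=Sf} \|g\|_{\tilde{\mathcal H}}^2 \tag{21} $$

for $f = f^{(1)} + f^{(2)}\in\tilde{\mathcal H}$ with $f^{(1)}\in\mathcal N(S)$, $f^{(2)}\in\mathcal N(S)^\perp$ since the map $f\mapsto f^{(2)}$ is the orthogonal projection onto $\mathcal N(S)^\perp$.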

Let us summarise what we have obtained: 

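Theorem 3.1 Assume that for each index $\omega\in\Omega$ we are given a r.k.h.s. $\mathcal H_\omega$ over some set $X$ with r.k. $K_\omega$ and norm $\|\cdot\|_\omega$; let $\tilde{\mathcal H}$ be the direct integral of the $\mathcal H_\omega$ with respect to the measure space $(\Omega,\mathcal A,\mu)$. Furthermore assume that for every $x\in X$ the cross-section of the r.k.s $\tilde K(\cdot,x) = (K_\omega(\cdot,x))_{\omega\in\Omega}\in\tilde{\mathcal H}$, i.e. it is measurable and

$$ \|\tilde K(\cdot,x)\|_{\tilde{\mathcal H}}^2 = \int_\Omega K_\omega(x,x)\,d\mu(\omega) < \infty \tag{22} $$

for every $x\in X$. Then,

$$ \mathcal H = \Big\{ \int_\Omega f_\omega\,d\mu(\omega) : (f_\omega)_{\omega\in\Omega}\in\tilde{\mathcal H} \Big\} \tag{23} $$

is a r.k.h.s. over $X$ with r.k. $K$ given by

$$ K(x,y) = \int_\Omega K_\omega(x,y)\,d\mu(\omega) \tag{24} $$

for every $x,y\in X$. The norm $\|\cdot\|$ on $\mathcal H$ is given by

$$ \|f\|^2 = \inf_{g\in\tilde{\mathcal H}\,:\,f=\int_\Omega g_\omega\,d\mu(\omega)} \int_\Omega \|g_\omega\|_\omega^2\,d\mu(\omega) \tag{25} $$

for any $f\in\mathcal H$. Also, for every $x\in X$, $\tilde K(\cdot,x)\in\mathcal N(S)^\perp$ with $\|\tilde K(\cdot,x)\|_{\tilde{\mathcal H}}^2 = K(x,x)$.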

Remark 3.2 If $\mu$ is a finite measure, and if the r.k.s are uniformly bounded, i.e. there is a constant $c<\infty$ such that $K_\omega(x,x)\le c$ for all $x\in X$ and $\mu$-a.e. $\omega\in\Omega$, then $K(x,x)\le c\,\mu(\Omega)<\infty$, so $K$ is uniformly bounded. If in addition, $X$ is a topological space and for every $x\in X$ and $\mu$-a.e. $\omega\in\Omega$ we have that $K_\omega(\cdot,x)$ is continuous, then $\mathcal H\subset C_b(X)$, the latter denoting the space of uniformly bounded continuous functions over $X$.



Let us note that the case of a direct sum of finitely many r.k.h.s.s considered in Section 2 is obtained as a special case of Theorem 3.1 by endowing $\Omega = \{1,\dots,n\}$, $n\in\mathbb N$, with the counting measure.

A crucial, but not easily verifiable, ingredient to obtain a r.k.h.s. via integration is that $\tilde K(\cdot,x) = (K_\omega(\cdot,x))_{\omega\in\Omega}$ is a measurable cross-section in the sense we introduced at the beginning of this section, i.e. there exists a collection of isometries $\{E_\omega : \mathcal H_\omega\to\mathbb C^n\}_{\omega\in\Omega}$ such that $\omega\mapsto E_\omega K_\omega(\cdot,x)$ is measurable. We will now state some conditions which guarantee this and are easy to check. The following Proposition and Lemma are taken from (Nielsen 1980, §2.8).

Proposition 3.3 Let $(\Omega,\mathcal A,\mu)$ be a measure space and $\{\mathcal H_\omega\}_{\omega\in\Omega}$ Hilbert spaces. Suppose that $F$ is a countable set of cross-sections such that

1. for each $\omega\in\Omega$ the family $\{f_\omega\}_{f\in F}$ is dense in $\mathcal H_\omega$; and

2. the map $\omega\mapsto\langle f_\omega,g_\omega\rangle_\omega$ is measurable for all $f,g\in F$.

Then there exists a collection of isometries $\{E_\omega\}_{\omega\in\Omega}$ such that all $f\in F$ are measurable cross-sections. This collection of isometries is unique in the sense that if $\{E'_\omega\}_{\omega\in\Omega}$ is another such collection then their sets of measurable cross-sections agree.

Lemma 3.4 Let $F$ and $\{E_\omega\}_{\omega\in\Omega}$ be as in Proposition 3.3. Then a necessary and sufficient condition that a cross-section $g$ is measurable w.r.t. the collection $\{E_\omega\}_{\omega\in\Omega}$ is that $\omega\mapsto\langle g_\omega,f_\omega\rangle_\omega$ is measurable for all $f\in F$.

Hence in the case of all $\mathcal H_\omega$ being r.k.h.s. over a separable topological space $X$, we have the following corollaries:

Corollary 3.5 Let $X$ be a separable topological space, $U\subset X$ a countable, dense subset. Assume that $\{K_\omega(\cdot,y) : y\in U\}$ is dense in $\mathcal H_\omega$ for each $\omega$ and that $\omega\mapsto K_\omega(x,y)$ is measurable for every $x\in X$ and $y\in U$. Then there exists a collection of isometries $\{E_\omega\}_{\omega\in\Omega}$ such that $(K_\omega(\cdot,x))_{\omega\in\Omega}$ is a measurable cross-section for all $x\in X$.

Corollary 3.6 Assume $\mathcal H_\omega$ for $\omega\in\Omega$ are r.k.h.s.s over a separable topological space $X$ with r.k.s $K_\omega$. Moreover, assume that the maps

1. $x\mapsto K_\omega(x,x)$, as well as

2. $x\mapsto K_\omega(x,y)$ for all $y$ in a countable, dense subset $U$ of $X$

are continuous for all $\omega\in\Omega$, and furthermore the maps $\omega\mapsto K_\omega(x,y)$ are measurable for all $x,y\in U$. Then there exists a collection of isometries $\{E_\omega\}_{\omega\in\Omega}$ such that $(K_\omega(\cdot,x))_{\omega\in\Omega}$ is a measurable cross-section for every $x\in X$.

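Proof Obviously, we want to use Proposition 3.3. Therefore we need to show that

$$ \mathcal H_\omega = \operatorname{clos}_{\mathcal H_\omega}\big( \operatorname{span}\{K_\omega(\cdot,y) : y\in U\} \big). \tag{26} $$

This is an immediate consequence of the continuity assumptions, the denseness of $U$, and $\mathcal H_\omega = \operatorname{clos}_{\mathcal H_\omega}(\operatorname{span}\{K_\omega(\cdot,x) : x\in X\})$; in fact, for every $\varepsilon>0$ and $x\in X$ there exists some $y\in U$ such that

$$ \|K_\omega(\cdot,x) - K_\omega(\cdot,y)\|_\omega^2 = K_\omega(x,x) - 2\operatorname{Re}(K_\omega(x,y)) + K_\omega(y,y) < \varepsilon. \tag{27} $$

By Lemma 3.4 the map $\omega\mapsto K_\omega(\cdot,x)$ is measurable for $x\in X\setminus U$ iff $\omega\mapsto K_\omega(x,y)$ is measurable for all $y\in U$. The latter is true since we can choose a sequence $x_n\in U$, $n\in\mathbb N$, with $\lim_{n\to\infty}x_n = x$, whence $\omega\mapsto K_\omega(x,y)$ is the limit of the measurable functions $\omega\mapsto K_\omega(x_n,y)$ and thus itself measurable. □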

Remark 3.7 Note that Theorem 3.1 can be viewed as a special case of the abstract framework in (Saitoh 1983): for some Hilbert space $\tilde{\mathcal H}$ consider a map $k : X\to\tilde{\mathcal H}$. Then the set $\mathcal H = \{x\mapsto\langle f,k(x)\rangle_{\tilde{\mathcal H}} : f\in\tilde{\mathcal H}\}$ of functions over $X$ forms a r.k.h.s. with kernel $K(x,y) = \langle k(y),k(x)\rangle_{\tilde{\mathcal H}}$. For $\tilde{\mathcal H}$ the direct integral above, $f$ any cross-section, and the special cross-sections $k(x) = (K_\omega(\cdot,x))_{\omega\in\Omega}$ this abstract construction in fact coincides with the one described above; in this setting, $S$ is given by $(Sf)(x) = \langle f,k(x)\rangle_{\tilde{\mathcal H}}$.

Remark 3.8 Along the same lines as above, Theorem 3.1 may be generalised to Hilbert space valued r.k.h.s.s (h.r.k.h.s.s). Recall the notion of a h.r.k.h.s., see e.g. (Carmeli et al. 2006): in analogy to the scalar case, this is a Hilbert space $\mathcal H$ of functions over a set $X$ with values in some separable Hilbert space $\mathcal E$ such that point evaluation is continuous. Denoting by $\mathcal B(\mathcal E)$ the set of bounded linear mappings from $\mathcal E$ to itself, continuity of point evaluation is, by the Riesz representation theorem, indeed equivalent to the existence of a kernel $K : X\times X\to\mathcal B(\mathcal E)$, with the property that for all $x\in X$ and $w\in\mathcal E$, the mapping $y\mapsto K(y,x)w$ is in $\mathcal H$, while fulfilling the reproducing equation

$$ \langle f, K(\cdot,x)w\rangle_{\mathcal H} = \langle f(x),w\rangle_{\mathcal E} \tag{28} $$

for every $f\in\mathcal H$. Note that this r.k. $K(x,x)$ is necessarily self-adjoint and satisfies

$$ \|f(x)\|_{\mathcal E} \le \sqrt{\|K(x,x)\|}\,\|f\|_{\mathcal H}. \tag{29} $$

The latter shows that, given a collection of h.r.k.h.s. $\mathcal H_\omega$ with r.k.s $K_\omega$ over a measure space $(\Omega,\mu)$, any cross-section in the corresponding direct integral, which by definition is a square integrable function in the sense of the Bochner integral, is Bochner integrable if

$$ \int_\Omega \|K_\omega(x,x)\|\,d\mu(\omega) < \infty, \tag{30} $$

cf. (13). Moreover, it is easily verified by the properties of the Bochner integral that the operator defined by

$$ K(x,y) : w\mapsto\int_\Omega K_\omega(x,y)w\,d\mu(\omega)\quad\text{for all } x,y\in X \tag{31} $$

is in $\mathcal B(\mathcal E)$, and $K$ satisfies the properties of a r.k. for the image of the direct integral under the summation operator $S$ obtained by a pointwise application of the Bochner integral, cf. (11), the image being endowed with the pullback inner product under $S$, cf. (17).

Note that this is still a special case of (Schwartz 1964, Proposition 20, p. 170) if one views h.r.k.h.s.s as continuously embedded subspaces of the space of all functions over $X$ endowed with the topology of pointwise convergence.



4 Special cases 

4.1 Integrating finite-dimensional reproducing kernel Hilbert spaces 

It is well-known that a huge class of r.k.h.s.s are describable as the range of some integral transform, take e.g. the Paley-Wiener spaces. We state some general conditions under which such kernels can be obtained via Theorem 3.1; cf. (Saitoh 1987) for an extensive treatment of this topic.

Corollary 4.1 In the setting of Theorem 3.1, assume there is a function $k : X\times\Omega\to\mathbb C$ such that for every $x\in X$, the function $\omega\mapsto k(x,\omega)$ is in $L_2(\Omega,\mu)$. Then the r.k.h.s. $\mathcal H$ associated with the kernel

$$ K(x,y) = \int_\Omega k(x,\omega)\overline{k(y,\omega)}\,d\mu(\omega) \tag{32} $$

is given by Theorem 3.1, i.e. $\mathcal H$ is the range of the operator $S : L_2(\Omega,\mu)\to\mathcal F(X)$,

$$ (Sa)(y) = \int_\Omega a(\omega)k(y,\omega)\,d\mu(\omega). \tag{33} $$



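Proof Let $\mathcal H_\omega = \operatorname{span}\{k(\cdot,\omega)\}\subset\mathcal F(X)$ endowed with the obvious inner product of the coefficients, i.e. $\|k(\cdot,\omega)\|_\omega = 1$, the isometries being given by $E_\omega : \mathcal H_\omega\to\mathbb C$, $k(\cdot,\omega)\mapsto 1$. Then, the r.k.s are given by

$$ K_\omega(x,y) = k(x,\omega)\overline{k(y,\omega)}. \tag{34} $$

Therefore, the cross-section $\tilde K(\cdot,y)$ is measurable iff $\omega\mapsto k(y,\omega)$ is, which we assumed. We then have

$$ \tilde{\mathcal H} = \{ \omega\mapsto a(\omega)k(\cdot,\omega) : a\in L_2(\Omega,\mu) \} \tag{35} $$

with

$$ \|\omega\mapsto a(\omega)k(\cdot,\omega)\|_{\tilde{\mathcal H}} = \|a\|_{L_2(\Omega,\mu)}. \tag{36} $$

At last we have for all $x\in X$,

$$ \int_\Omega K_\omega(x,x)\,d\mu(\omega) = \int_\Omega |k(x,\omega)|^2\,d\mu(\omega) = \|k(x,\cdot)\|_{L_2(\Omega,\mu)}^2 < \infty \tag{37} $$

by assumption. Hence, Theorem 3.1 is applicable. □

Remark 4.2 Note that one can extend the above corollary to the case in which each $\mathcal H_\omega$ is the span of a finite number of such maps $k_i : X\times\Omega\to\mathbb C$ for $i = 1,\dots,d$ in the obvious way without changing the proof.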

We will now revisit two classical examples. 
4.1.1 Paley-Wiener space

Consider the continuous Fourier transform $F : L_2(\mathbb R)\to L_2(\mathbb R)$, $f\mapsto\int_{\mathbb R} f(x)e^{-2\pi i\omega x}\,dx$, which is an isometry, let $\Omega\subset\mathbb R$ be compact, further let $\mathcal S(\Omega) = \{f\in L_2(\mathbb R) : \forall\omega\notin\Omega\ f(\omega) = 0\}$ be the closed sub-space of functions with support in $\Omega$, and denote the corresponding Paley-Wiener space of $\Omega$-band-limited functions by $\mathrm{PW}(\Omega) = F^{-1}(\mathcal S(\Omega))$. Note that $\mathcal S(\Omega)$ is isometrically isomorphic to $L_2(\Omega)$ by restriction to $\Omega$.

Applying Corollary 4.1 to the function

$$ k : \mathbb R\times\Omega\to\mathbb C,\quad (x,\omega)\mapsto e^{2\pi ix\omega}, \tag{38} $$

which obviously satisfies the assumption, and using the Lebesgue measure on $\Omega$ leads to the Hilbert space

$$ \mathcal H = \Big\{ f = \int_\Omega a(\omega)e^{2\pi i\omega\,\cdot}\,d\omega : a\in L_2(\Omega) \Big\} \tag{39} $$

with norm $\|f\| = \|a\|_{L_2(\Omega)} = \|f\|_{L_2(\mathbb R)}$. The latter is true by Parseval's identity, the former by the isometry of $L_2(\Omega)$ and $\mathcal S(\Omega)$. Hence $\mathcal H$ is indeed the space $\mathrm{PW}(\Omega)$. Moreover, we can conclude that

$$ K(\cdot,y) = \int_\Omega K_\omega(\cdot,y)\,d\mu(\omega) = \int_\Omega \exp(2\pi i\omega(\cdot - y))\,d\mu(\omega) \tag{40} $$

is the r.k. for $\mathrm{PW}(\Omega)$, e.g. for $\Omega = [-\frac12,\frac12]$, we obtain the r.k. for $\mathrm{PW}([-\frac12,\frac12])$ as

$$ \int_{-1/2}^{1/2} \exp(2\pi i\omega(x-y))\,d\mu(\omega) = \frac{\sin(\pi(x-y))}{\pi(x-y)} = \operatorname{sinc}(\pi(x-y)). \tag{41} $$

Note that the latter is the well known kernel of the Paley-Wiener space, see e.g. (Wendland 2005, Theorem 10.12).
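The pointwise integration in (41) can be checked numerically. The sketch below is only an illustration under stated assumptions: it approximates the integral of $\exp(2\pi i\omega(x-y))$ over $[-\frac12,\frac12]$ by a midpoint rule and compares the result with $\sin(\pi(x-y))/(\pi(x-y))$. The function names and the number of quadrature nodes are hypothetical choices.

```python
import cmath
import math


def pw_kernel_quadrature(x, y, n=2000):
    """Midpoint rule for the integral of exp(2*pi*i*w*(x - y)) over w in [-1/2, 1/2]."""
    h = 1.0 / n
    total = 0j
    for j in range(n):
        w = -0.5 + (j + 0.5) * h  # midpoint of the j-th sub-interval
        total += cmath.exp(2j * math.pi * w * (x - y))
    return total * h


def sinc_pi(t):
    """sin(pi*t) / (pi*t), continued by the value 1 at t = 0."""
    return 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)
```

For instance, `pw_kernel_quadrature(1.3, 0.2)` should agree with `sinc_pi(1.1)` up to the quadrature error, with a negligible imaginary part because the nodes are symmetric about zero.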

4.1.2 Global Sobolev kernels 

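The global Sobolev space $W_2^m(\mathbb R^d)$ is the space of functions with weak derivatives up to order $m$ being in $L_2(\mathbb R^d)$. One endows this space with the inner product

$$ \langle f,g\rangle_{W_2^m(\mathbb R^d)} = \int_{\mathbb R^d} (1+\|\omega\|^2)^m\,Ff(\omega)\overline{Fg(\omega)}\,d\omega, \tag{42} $$

where $F : L_2(\mathbb R^d)\to L_2(\mathbb R^d)$ denotes the Fourier transform again. It is well known, see e.g. (Nashed and Walter 1991, eq. (5.26)) or again (Wendland 2005, Theorem 10.12), that this space possesses the r.k.

$$ K(x,y) = \int_{\mathbb R^d} (1+\|\omega\|^2)^{-m}e^{2\pi i(x-y)^t\omega}\,d\omega \tag{43} $$

if $d<2m$. This can be seen from Corollary 4.1 by considering the function

$$ k : \mathbb R^d\times\mathbb R^d\to\mathbb C,\quad (x,\omega)\mapsto e^{2\pi ix^t\omega}, \tag{44} $$

which is uniformly bounded by 1, while $\omega\mapsto k(x,\omega)$ is measurable for each $x\in\mathbb R^d$ since it is continuous. Moreover, choose the measure $\mu$ on $\mathbb R^d$ s.t.

$$ d\mu(\omega) = (1+\|\omega\|^2)^{-m}\,d\omega, \tag{45} $$

denoting the Lebesgue measure by $d\omega$. Surely, we have $\mu(\mathbb R^d)<\infty$ by $d<2m$ and hence $k(x,\cdot)\in L_2(\mathbb R^d,\mu)$. Therefore we can apply Corollary 4.1 to obtain a r.k.h.s. $\mathcal H$ fulfilling

$$ \begin{aligned} \langle f,g\rangle_{\mathcal H} &= \int_{\mathbb R^d} a(\omega)\overline{b(\omega)}\,d\mu(\omega) \\ &= \int_{\mathbb R^d} (1+\|\omega\|^2)^m\,FF^{-1}\big(a(\omega)(1+\|\omega\|^2)^{-m}\big)\,\overline{FF^{-1}\big(b(\omega)(1+\|\omega\|^2)^{-m}\big)}\,d\omega \\ &= \int_{\mathbb R^d} (1+\|\omega\|^2)^m\,F(f)\overline{F(g)}\,d\omega = \langle f,g\rangle_{W_2^m(\mathbb R^d)} \end{aligned} \tag{46} $$

for $f = \int_{\mathbb R^d} a(\omega)e^{2\pi i\omega^t\cdot}\,d\mu(\omega)$, $g = \int_{\mathbb R^d} b(\omega)e^{2\pi i\omega^t\cdot}\,d\mu(\omega)$ and $a,b\in L_2(\mathbb R^d,\mu)$. Note that here, $\mathcal N(S) = \{0\}$ by the injectivity of the Fourier transform, i.e. the representations in the direct integral are unique, so the scalar product is obtained by integrating over the individual scalar products. Hence $\mathcal H = W_2^m(\mathbb R^d)$ with r.k. given in (43).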
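As a worked instance of (43), take the simplest admissible case $d = 1$, $m = 1$. Assuming the convention $e^{2\pi i(x-y)\omega}$ used above, the integral then has the closed form $K(x,y) = \pi e^{-2\pi|x-y|}$, a standard Fourier pair. The following sketch compares this with a truncated midpoint rule; the cutoff, the node count and the function names are assumptions made for the illustration, and only the cosine part is integrated because the sine part cancels by symmetry.

```python
import math


def sobolev_kernel_quadrature(t, cutoff=100.0, n=40000):
    """Midpoint rule for the integral of cos(2*pi*t*w) / (1 + w^2) over [-cutoff, cutoff]."""
    h = 2.0 * cutoff / n
    total = 0.0
    for j in range(n):
        w = -cutoff + (j + 0.5) * h
        total += math.cos(2.0 * math.pi * t * w) / (1.0 + w * w)
    return total * h


def sobolev_kernel_closed_form(t):
    """pi * exp(-2*pi*|t|), the value of (43) for d = 1, m = 1 with t = x - y."""
    return math.pi * math.exp(-2.0 * math.pi * abs(t))
```

The agreement is limited by the truncation of the integration domain, which is most visible near $t = 0$, where the integrand does not oscillate and the tails contribute about $2/\text{cutoff}$.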
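4.1.3 Expansion kernels

Consider a collection $\varphi_n\in\mathcal F(X)$, $n\in\mathbb N$, of linearly independent functions over $X$; let $\lambda_n>0$ for $n\in\mathbb N$ and assume $\sum_{n\in\mathbb N}\lambda_n|\varphi_n(x)|^2<\infty$ for all $x\in X$. Clearly, Corollary 4.1 is applicable if we let $\Omega = \mathbb N$, $\mu(\{\omega\}) = \lambda_\omega$ and $k(\cdot,\omega) = \varphi_\omega$. The resulting Hilbert space is

$$ \mathcal H = \Big\{ \sum_{n\in\mathbb N}\lambda_n a_n\varphi_n : \sum_{n\in\mathbb N}\lambda_n|a_n|^2<\infty \Big\} \tag{47} $$

with kernel given by

$$ K(x,y) = \sum_{n\in\mathbb N}\lambda_n\varphi_n(x)\overline{\varphi_n(y)}. \tag{48} $$

By the linear independence of the ansatz functions $\varphi_n$, the kernel of $S : L_2(\Omega) = \ell_2((\lambda_n)_{n\in\mathbb N}) = \{(a_n)_{n\in\mathbb N} : \sum_{n\in\mathbb N}\lambda_n|a_n|^2<\infty\}\to\mathcal H$, $(a_n)_{n\in\mathbb N}\mapsto\sum_{n\in\mathbb N}\lambda_n a_n\varphi_n$ is trivial, $\mathcal N(S) = \{0\}$, whence $S$ is an isometry between $\mathcal H$ and the space of sequences square-summable w.r.t. the weights $\lambda_n$. Putting $\gamma_n = \lambda_n a_n$, we obtain an isometry to the space $\ell_2((\lambda_n^{-1})_{n\in\mathbb N})$ such that $f\in\mathcal H$ iff $f = \sum_{n\in\mathbb N}\gamma_n\varphi_n$ with $(\gamma_n)_{n\in\mathbb N}\in\ell_2((\lambda_n^{-1})_{n\in\mathbb N})$, the space of sequences square-summable w.r.t. the weights $\lambda_n^{-1}$. Hence, $(\sqrt{\lambda_n}\varphi_n)_{n\in\mathbb N}$ forms an orthonormal basis of $\mathcal H$.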
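A small numerical sketch may help to make the expansion construction concrete. It assumes the hypothetical choice $\varphi_n(x) = x^n$ and $\lambda_n = 1/n!$ on $X = \mathbb R$, indexed from $n = 0$ and truncated after finitely many terms, so that (48) approximates the exponential kernel $e^{xy}$. It also evaluates the inner product $\langle f,g\rangle = \sum_n\lambda_n a_n b_n$ in coefficient form (real case) to illustrate the reproducing property against $K(\cdot,y)$, whose coefficients are $b_n = \varphi_n(y)$.

```python
import math

N_TERMS = 40  # truncation level, an assumption of this sketch
LAMBDA = [1.0 / math.factorial(n) for n in range(N_TERMS)]  # weights lambda_n = 1/n!


def phi(n, x):
    """Hypothetical ansatz functions phi_n(x) = x^n."""
    return x ** n


def kernel(x, y):
    """Truncated expansion kernel (48) in the real case."""
    return sum(LAMBDA[n] * phi(n, x) * phi(n, y) for n in range(N_TERMS))


def evaluate(coeffs, x):
    """f(x) for f = sum_n lambda_n a_n phi_n as in (47)."""
    return sum(LAMBDA[n] * a * phi(n, x) for n, a in enumerate(coeffs))


def inner(coeffs_f, coeffs_g):
    """<f, g> = sum_n lambda_n a_n b_n for real coefficient sequences."""
    return sum(LAMBDA[n] * a * b for n, (a, b) in enumerate(zip(coeffs_f, coeffs_g)))


def kernel_coeffs(y):
    """Coefficients b_n = phi_n(y) of K(., y) in the representation (47)."""
    return [phi(n, y) for n in range(N_TERMS)]
```

With these definitions, `inner(a, kernel_coeffs(y))` and `evaluate(a, y)` coincide for any coefficient list `a`, and `inner(kernel_coeffs(y), kernel_coeffs(y))` equals `kernel(y, y)`.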

4.2 Integrating infinite-dimensional reproducing kernel Hilbert spaces 

We will now consider cases in which the individual r.k.h.s.s $\mathcal H_\omega$ are in general infinite-dimensional.

4.2.1 Positive definite functions 

Assume that $X$ is a group with neutral element $e$. We then call $\psi : X\to\mathbb C$ a positive definite function (p.d.f.) if it gives rise to a r.k. $H$ via

$$ H(x,y) = \psi(xy^{-1}),\quad x,y\in X; \tag{49} $$

such a kernel is called translation invariant.

The classical example is a finite-dimensional Euclidean vector space, $X = \mathbb R^d$, $d\in\mathbb N$. The famous Theorem of Bochner (1933) characterises the p.d.f.s in this case completely: $\psi$ is a p.d.f. if and only if there is a finite measure $\mu$ on $\mathbb R^d$ such that

$$ \psi(x) = \int_{\mathbb R^d} \exp(i\omega^t x)\,d\mu(\omega),\quad\forall x\in X. \tag{50} $$

Since $\psi_\omega : X\to\mathbb C$, $x\mapsto\exp(i\omega^t x)$ is a p.d.f. for every $\omega$, $K_\omega(x,y) = \exp(i\omega^t(x-y))$ having been used several times above, the easy "if" part of Bochner's Theorem is a direct consequence of Theorem 3.1 with Corollary 3.6. In fact, we have more generally:

Corollary 4.3 Let $X$ be a separable topological group, $(\Omega,\mu)$ a measure space, and $\varphi : X\times\Omega\to\mathbb C$ a map such that $\varphi(x,\cdot)$ is in $L_2(\Omega,\mu)$ for every $x\in X$ while $\varphi(\cdot,\omega)$ is a continuous p.d.f. for every $\omega\in\Omega$. Then

$$ \psi(\cdot) = \int_\Omega \varphi(\cdot,\omega)\,d\mu(\omega) \tag{51} $$

is a p.d.f. again.
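For a discrete measure, the "if" direction of (50) reduces to an elementary identity: the quadratic form $\sum_{i,j} c_i\overline{c_j}\,\psi(x_i - x_j)$ equals $\sum_k \mu_k\,|\sum_i c_i e^{i\omega_k x_i}|^2$ and is therefore non-negative. The sketch below checks this on $X = \mathbb R$; the frequencies, masses and function names are hypothetical values chosen for illustration only.

```python
import cmath

OMEGAS = [-2.0, 0.5, 3.0]   # hypothetical support points of the measure
WEIGHTS = [0.2, 1.0, 0.7]   # hypothetical positive masses mu({omega_k})


def psi(x):
    """Mixture psi(x) = sum_k mu_k * exp(i * omega_k * x), cf. (50)."""
    return sum(m * cmath.exp(1j * w * x) for w, m in zip(OMEGAS, WEIGHTS))


def quadratic_form(points, coeffs):
    """sum_{i,j} c_i * conj(c_j) * psi(x_i - x_j)."""
    return sum(ci * cj.conjugate() * psi(xi - xj)
               for xi, ci in zip(points, coeffs)
               for xj, cj in zip(points, coeffs))


def sum_of_squares(points, coeffs):
    """sum_k mu_k * |sum_i c_i * exp(i * omega_k * x_i)|^2, manifestly non-negative."""
    return sum(m * abs(sum(c * cmath.exp(1j * w * x) for x, c in zip(points, coeffs))) ** 2
               for w, m in zip(OMEGAS, WEIGHTS))
```

The two functions agree for arbitrary points and complex coefficients, which is exactly the positive definiteness of the kernel $H(x,y) = \psi(x-y)$ obtained by summing the kernels $K_{\omega_k}$ with weights $\mu_k$.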

4.2.2 Radial basis functions 

We will now consider radial basis functions associated with r.k.h.s.s: we call a mapping $\psi : [0,\infty)\to\mathbb C$ a radial basis function (r.b.f.) on the metric space $(X,\Delta)$ if it gives rise to a r.k. $H$ via

$$ H(x,y) = \psi(\Delta(x,y)),\quad x,y\in X. \tag{52} $$

For separable Hilbert spaces $X$, Schoenberg (1938) gave a characterisation of its r.b.f.s as scale mixtures: $\psi$ is a r.b.f. if and only if there exists a finite measure $\mu$ on $[0,\infty)$ such that, $J_\alpha$ denoting the $\alpha$-th Bessel function of the first kind,

$$ \psi(\delta) = \int_0^\infty \Gamma\big(\tfrac d2\big)\Big(\frac{2}{\omega\delta}\Big)^{\frac{d-2}{2}} J_{\frac{d-2}{2}}(\omega\delta)\,d\mu(\omega),\quad\forall\delta\in[0,\infty), \tag{53} $$

in case $X$ is $d$-dimensional for some $d\in\mathbb N$, or

$$ \psi(\delta) = \int_0^\infty \exp(-(\omega\delta)^2)\,d\mu(\omega),\quad\forall\delta\in[0,\infty), \tag{54} $$

in case $X$ is infinite-dimensional. Again we can generalise the simpler "if" parts of Schoenberg's theorems using Theorem 3.1 and Corollary 3.6:

Corollary 4.4 Let $(\Omega,\mu)$ be a measure space, and $\varphi : [0,\infty)\times\Omega\to\mathbb C$ a map such that $\varphi(\delta,\cdot)$ is in $L_2(\Omega,\mu)$ for every $\delta\in[0,\infty)$ while $\varphi(\cdot,\omega)$ is a continuous r.b.f. on the separable metric space $(X,\Delta)$ for every $\omega\in\Omega$. Then for any finite positive measure $\mu$ on $\Omega$,

$$ \psi(\cdot) = \int_\Omega \varphi(\cdot,\omega)\,d\mu(\omega) \tag{55} $$

is also a r.b.f. on $(X,\Delta)$.

In fact, the last result is rather well-known for autocorrelation functions, see e.g. (Yaglom 
1987, p. 355), where it has been used to construct stationary and isotropic Gaussian random 
fields with certain desirable properties, e.g. through scale-mixtures of the Euclidean hat function 
by Gneiting (1999). More generally, scale-mixtures of compactly supported r.b.f. have been 
considered by Buhmann (1998) since they lead to numerically favourable band matrices. 
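A concrete scale mixture of the form (54) can be written down in closed form. Assuming the finite measure $d\mu(\omega) = 2\omega e^{-\omega^2}\,d\omega$ on $[0,\infty)$ (a hypothetical choice with total mass 1), the substitution $u = \omega^2$ gives $\psi(\delta) = \int_0^\infty e^{-u(1+\delta^2)}\,du = 1/(1+\delta^2)$, the inverse quadratic function. The sketch below checks this by a midpoint rule; the upper integration limit and the node count are illustrative assumptions.

```python
import math


def mixture_rbf(delta, upper=8.0, n=20000):
    """Midpoint rule for the integral of exp(-(w*delta)^2) * 2*w*exp(-w^2) over [0, upper]."""
    h = upper / n
    total = 0.0
    for j in range(n):
        w = (j + 0.5) * h
        total += math.exp(-(w * delta) ** 2) * 2.0 * w * math.exp(-w * w)
    return total * h


def inverse_quadratic(delta):
    """Closed form 1 / (1 + delta^2) of the mixture above."""
    return 1.0 / (1.0 + delta * delta)
```

Under these assumptions the inverse quadratic is a scale mixture of Gaussian r.b.f.s, so by (54) it is itself a r.b.f. on every separable Hilbert space.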



5 Sampling 

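The classical Kramer sampling theorem provides a method for obtaining orthogonal sampling theorems in the setting of integral transforms. We show that Kramer's sampling theorem (Kramer 1959) can be viewed as a statement about orthogonal bases of $\mathcal N(S)^\perp$:

Proposition 5.1 Assume we are in the setting of Theorem 3.1, its assumptions being fulfilled, with the r.k.h.s.s $\mathcal H_\omega$ all being of dimension $d\in\mathbb N^\infty$. Moreover, assume there is a sequence of $y_n\in X$, $n\in\mathbb N$, such that $\{\omega\mapsto E_\omega K_\omega(\cdot,y_n)\}_{n\in\mathbb N}$ forms a complete orthogonal set of $L_2(\Omega\to\mathbb C^d,\mu)$. Then $\{K(\cdot,y_n)\}_{n\in\mathbb N}$ forms a complete orthogonal set in $\mathcal H$; in particular one has

$$ g(x) = \sum_{n\in\mathbb N} g(y_n)\frac{K(x,y_n)}{K(y_n,y_n)} \tag{56} $$

for all $g\in\mathcal H$ and $x\in X$.

Proof Let $f,g\in\mathcal N(S)^\perp\subset\tilde{\mathcal H}$ using the same notations as in Section 3. The statement that $\{K(\cdot,y_n)\}_{n\in\mathbb N}$ forms an orthogonal set is due to

$$ \langle E_\cdot f_\cdot, E_\cdot g_\cdot\rangle_{L_2(\Omega\to\mathbb C^d,\mu)} = \langle S(f),S(g)\rangle_{\mathcal H}, \tag{57} $$

cf. (10). Moreover, assume $0 = \langle E_\cdot f_\cdot, E_\cdot K_\cdot(\cdot,y_n)\rangle_{L_2(\Omega\to\mathbb C^d,\mu)} = \langle S(f),K(\cdot,y_n)\rangle_{\mathcal H}$ for all $n\in\mathbb N$. By completeness we obtain $E_\omega f_\omega = 0$ $\mu$-a.e. and hence

$$ S(f)(x) = \int_\Omega \langle E_\omega f_\omega, E_\omega K_\omega(\cdot,x)\rangle\,d\mu(\omega) = 0 \tag{58} $$

for all $x\in X$, establishing the completeness of $\{K(\cdot,y_n)\}_{n\in\mathbb N}$. By continuity of point evaluation in $\mathcal H$, (56) follows. □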

Note that we restricted ourselves to Hilbert spaces with identical dimensions merely for notational 
simplicity. 

Using Corollary 4.1, an immediate consequence of Proposition 5.1 is the Kramer sampling 
theorem, see (Jerri 1977). 

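Corollary 5.2 Let $\mu$ be a measure on a space $\Omega$, and $k : X\times\Omega\to\mathbb C$ a function such that for every $x\in X$, the function $\omega\mapsto k(x,\omega)$ is measurable and in $L_2(\Omega,\mu)$. If there exists a sequence of $y_n\in X$, $n\in\mathbb N$, such that the set $\{k(y_n,\cdot)\}_{n\in\mathbb N}$ forms an orthogonal sequence in $L_2(\Omega,\mu)$, then, for every $a\in L_2(\Omega,\mu)$, we have the sampling equation

$$ f(x) = \int_\Omega a(\omega)k(x,\omega)\,d\mu(\omega) = \sum_{n\in\mathbb N} f(y_n)\frac{\int_\Omega k(x,\omega)\overline{k(y_n,\omega)}\,d\mu(\omega)}{\int_\Omega |k(y_n,\omega)|^2\,d\mu(\omega)}. \tag{59} $$
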
Remark 5.3 Let us point out that in the specific case where $X = \mathbb R^d$ and $y_n\in\mathbb Z^d$ for $n\in\mathbb N$, it is well known that the set $\{k(y_n,\cdot)\}_{n\in\mathbb N}$ forms an orthogonal sequence iff the bracket $[k(0,\cdot),k(0,\cdot)]$ is constant, the latter being defined through $[f,g](\omega) = \sum_{n\in\mathbb Z^d}(Ff)(\omega+n)\overline{(Fg)(\omega+n)}$ where $F$ again denotes the Fourier transform; see e.g. (Jetter and Plonka 2001) for details and extensions in this direction of shift invariant spaces.
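The sampling expansion (56) can be illustrated with the Paley-Wiener space of Section 4.1.1. Assuming $\Omega = [-\frac12,\frac12]$ and sampling points $y_n = n\in\mathbb Z$, the functions $\omega\mapsto e^{2\pi in\omega}$ form a complete orthogonal set of $L_2(\Omega)$, $K(n,n) = 1$, and (56) becomes the classical cardinal series. The sketch below truncates the series symmetrically and reconstructs the band-limited test function $g = K(\cdot, 0.3)$; the truncation level, the test function and the function names are assumptions made for illustration.

```python
import math


def sinc_pi(t):
    """sin(pi*t) / (pi*t), continued by the value 1 at t = 0."""
    return 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)


def kernel(x, y):
    """Reproducing kernel (41) of PW([-1/2, 1/2])."""
    return sinc_pi(x - y)


def sampled_reconstruction(g, x, n_max=2000):
    """Truncated version of (56) with integer sampling points y_n = n."""
    return sum(g(n) * kernel(x, n) / kernel(n, n) for n in range(-n_max, n_max + 1))
```

Since the terms decay only like $1/n^2$, the truncation error is of order $1/n_{\max}$; the agreement with `g(x)` is therefore approximate rather than exact.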



6 Measurements and representation 

In the situation of Corollary 4.1, we consider measurements either of the image $f = Sa$ of $a$ under $S$, or of $a$ itself. In the former case, we shall determine the corresponding pre-image $a\in L_2(\Omega,\mu)$, in the latter case we shall be interested in determining the pointwise error in the image domain when interpolating the measurements of the pre-image $a$.



6.1 Representation when observing the image 

First of all let us briefly recall the advantage of using a r.k.h.s. $\mathcal H$ for Tikhonov-like regularisation, which rests on the fact that every minimiser can be expressed as a linear combination of the r.k. $K(\cdot,\cdot)$ evaluated at the sampling points $\{x_i\}_{i=1,\dots,N}$ due to the representer theorem. The following version of this theorem is due to Schölkopf et al. (2000):

Theorem 6.1 Let $\mathcal H$ be a r.k.h.s. with r.k. $K : X\times X\to\mathbb C$. Furthermore, let $\{x_i\}_{i=1,\dots,N}\subset X$ be a set of sampling points, $\lambda : \mathbb R_{\ge0}\to\mathbb R$ a non-decreasing function and $L : \mathbb C^N\to\mathbb R\cup\{\infty\}$ an arbitrary loss function. Then the functional $J : \mathcal H\to\mathbb R\cup\{\infty\}$ given by

$$ J(f) = L(f(x_1),\dots,f(x_N)) + \lambda(\|f\|) \tag{60} $$

possesses a minimiser $f$ of the form

$$ f = \sum_{i=1}^N \alpha_i K(\cdot,x_i)\quad\text{with } \alpha_1,\dots,\alpha_N\in\mathbb C. \tag{61} $$

Furthermore, if $\lambda$ is strictly monotonically increasing, every minimiser is of form (61).

Proof Let $f$ be a minimiser of (60). Consider the orthogonal projection $f_\parallel$ of $f$ onto the finite dimensional subspace $\operatorname{span}\{K(\cdot,x_i) : 1\le i\le N\}$ and denote by $f_\perp$ the orthogonal part of $f$. Then by the reproducing property

$$ f(x_i) = \langle f,K(\cdot,x_i)\rangle = \langle f_\parallel + f_\perp,K(\cdot,x_i)\rangle = \langle f_\parallel,K(\cdot,x_i)\rangle = f_\parallel(x_i) \tag{62} $$

and hence $L(f(x_1),\dots,f(x_N)) = L(f_\parallel(x_1),\dots,f_\parallel(x_N))$. On the other hand, we have

$$ \|f\|^2 = \|f_\parallel\|^2 + \|f_\perp\|^2 \ge \|f_\parallel\|^2, \tag{63} $$

and therefore $f_\parallel$ is also a minimiser of (60) since $\lambda$ is non-decreasing. In the case of a strictly increasing $\lambda$, we get $\lambda(\|f\|) > \lambda(\|f_\parallel\|)$ if $\|f_\perp\| > 0$, and hence $f = f_\parallel$. □

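Now, using the notation of Section 3, and assuming that $\mathcal H$ arises from integrating r.k.h.s.s as in Theorem 3.1, consider the functional $\tilde J : \tilde{\mathcal H}\to\mathbb R\cup\{\infty\}$

$$ \begin{aligned} \tilde J(g) &= L\big((S(g))(x_1),\dots,(S(g))(x_N)\big) + \lambda(\|g\|_{\tilde{\mathcal H}}) \\ &= L\big((S(g))(x_1),\dots,(S(g))(x_N)\big) + \lambda\Big(\sqrt{\|g_\parallel\|_{\tilde{\mathcal H}}^2 + \|g_\perp\|_{\tilde{\mathcal H}}^2}\Big) \end{aligned} \tag{64} $$

with $g_\perp\in\mathcal N(S)^\perp$ and $g_\parallel\in\mathcal N(S)$ the respective orthogonal projections of $g\in\tilde{\mathcal H}$. Then, if $\lambda$ is strictly monotonically increasing, every minimiser $g$ of (64) is an element of $\mathcal N(S)^\perp$. Since $S(\mathcal N(S)^\perp) = \mathcal H$ we have established that minimising (64) is equivalent to minimising (60), the minimisers being related by $f = S(g)\in\mathcal H$. The representer theorem above then yields that every minimiser of (64) fulfils

$$ S(g) = \sum_{i=1}^N \alpha_i K(\cdot,x_i) = \int_\Omega \sum_{i=1}^N \alpha_i K_\omega(\cdot,x_i)\,d\mu(\omega), \tag{65} $$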
i=i •'^^ i=i 
and thus, cf. (19),

$$ g = \sum_{i=1}^N \alpha_i\tilde K(\cdot,x_i)\in\mathcal N(S)^\perp. \tag{66} $$

This calculation becomes particularly interesting when $\mathcal H$ is obtained via an integral transform, i.e. in the situation of Corollary 4.1; it then yields a method of estimating preimages of $f\in\mathcal H$ from measurements. In fact, we then always reconstruct the preimage from $\mathcal N(S)^\perp$; let

$$ S : L_2(\Omega,\mu)\to S(L_2(\Omega,\mu)),\quad (Sa)(x) = \int_\Omega a(\omega)k(x,\omega)\,d\mu(\omega). \tag{67} $$

Now, minimising $\tilde J$ in (64) is equivalent to minimising $\hat J : L_2(\Omega,\mu)\to\mathbb R\cup\{\infty\}$,

$$ \hat J(a) = L\big((S(a))(x_1),\dots,(S(a))(x_N)\big) + \lambda(\|a\|_{L_2(\Omega,\mu)}) \tag{68} $$

for $a\in L_2(\Omega,\mu)$, and by the derivation above its minimiser takes the form $a(\omega) = \sum_{i=1}^N \alpha_i\overline{k(x_i,\omega)}$. In summary, we obtain the following result:

Proposition 6.2 Let $S$ be an integral transform with kernel $k : X\times\Omega\to\mathbb C$ satisfying $\omega\mapsto k(x,\omega)\in L_2(\Omega,\mu)$ for every $x\in X$, as in Corollary 4.1, and assume $L,\lambda$ satisfy the assumptions of Theorem 6.1. Then the functional $\hat J : L_2(\Omega,\mu)\to\mathbb R\cup\{\infty\}$,

$$ \hat J(a) = L\big((S(a))(x_1),\dots,(S(a))(x_N)\big) + \lambda(\|a\|_{L_2(\Omega,\mu)}), \tag{69} $$

possesses a minimiser $a^*$ admitting the representation

$$ a^*(\omega) = \sum_{i=1}^N \alpha_i\overline{k(x_i,\omega)}. \tag{70} $$

Indeed, for any minimiser $a^*$ of $\hat J$, we have that $S(a^*)$ minimises $J$ in Theorem 6.1, while for any minimiser $f\in\mathcal H$ of $J$ the unique pre-image $a^*\in\mathcal N(S)^\perp$ with $S(a^*) = f$ minimises $\hat J$.

Furthermore, if $\lambda$ is strictly increasing, then any minimiser $a^*$ of $\hat J$ is of this form. In fact, then $a^*\in\mathcal N(S)^\perp$.

Remark 6.3 A typical situation where the minimisation of $\hat J$ in (68) occurs is in inverse problems: one only observes the image $S(a)$ of the function $a\in L_2(\Omega,\mu)$ of interest, usually with some noise; here, $S$ is an integral operator. One then wants to find a function $a^*$ which is close to the data as measured by $L$ but not too large in norm, as the perturbed data no longer lie in the range of $S$, i.e. in the r.k.h.s. $\mathcal H$, whence the regularisation via $\lambda$. There is then a well-developed theory showing under which conditions the minimiser $a^*$ will be close to the true function $a$, see e.g. (Engl et al. 1996).

As an example consider for some regularisation parameter $\gamma>0$ and data $y = (y_i)_{i=1}^N\in\mathbb C^N$ the quadratic loss function

$$ \hat J(a) = \sum_{i=1}^N |y_i - (S(a))(x_i)|^2 + \gamma\|a\|_{L_2(\Omega,\mu)}^2, \tag{71} $$

to be minimised over $a\in L_2(\Omega,\mu)$. Then, with the matrix $H = (K(x_i,x_j))_{i,j=1}^N\in\mathbb C^{N\times N}$, the loss in dependence of $\alpha$ is given by

$$ |H\alpha - y|^2 + \gamma\Big\|\sum_{i=1}^N \alpha_i K(\cdot,x_i)\Big\|_{\mathcal H}^2 = \alpha^*H^*H\alpha - \alpha^*H^*y - y^*H\alpha + y^*y + \gamma\alpha^*H\alpha. \tag{72} $$

A short calculation shows that the minimising $\alpha$ solves the equation

$$ (H^*H + \gamma H)\alpha = H^*y, \tag{73} $$

and thus the minimiser $a^*$ of the functional can be computed explicitly from (70).
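For a real symmetric Gram matrix $H$, equation (73) is implied by the smaller system $(H + \gamma I)\alpha = y$, since multiplying it by $H$ from the left gives $(H^2 + \gamma H)\alpha = Hy$. The following sketch solves that system by Gaussian elimination and evaluates the loss (72). It is a minimal illustration, not the method of any particular library; the kernel passed in, the sample points and all function names are hypothetical.

```python
def gram_matrix(points, kernel):
    """H = (K(x_i, x_j)) for a real symmetric kernel."""
    return [[kernel(xi, xj) for xj in points] for xi in points]


def matvec(a, v):
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]


def solve(a, b):
    """Gaussian elimination with partial pivoting for a real square system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def tikhonov_coefficients(points, data, gamma, kernel):
    """Return H and alpha with (H + gamma*I) alpha = y, which implies (73) for symmetric H."""
    h = gram_matrix(points, kernel)
    n = len(points)
    shifted = [[h[i][j] + (gamma if i == j else 0.0) for j in range(n)] for i in range(n)]
    return h, solve(shifted, data)


def objective(h, alpha, data, gamma):
    """Loss (72): |H alpha - y|^2 + gamma * alpha^T H alpha (real case)."""
    h_alpha = matvec(h, alpha)
    residual = sum((u - v) ** 2 for u, v in zip(h_alpha, data))
    return residual + gamma * sum(a * u for a, u in zip(alpha, h_alpha))
```

A possible use is `h, alpha = tikhonov_coefficients(points, y, 0.1, lambda s, t: 1.0 / (1.0 + (s - t) ** 2))`, after which the pre-image follows from (70) once the transform kernel $k$ generating $K$ is known.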
6.2 Interpolating the pre-image

We now change our viewpoint, assuming that we observe the pre-image $a$. For this to make sense, in addition to the assumptions of Corollary 4.1, let $\mathcal G$ be a r.k.h.s. over $\Omega$ with kernel $G$ such that the diagonal $\omega\mapsto d(\omega) = G(\omega,\omega)^{1/2} = \|G(\cdot,\omega)\|_{\mathcal G}\in L_2(\Omega,\mu)$; thence $G(\cdot,\omega)\in L_2(\Omega,\mu)$ for every $\omega$, so $\mathcal G\subset L_2(\Omega,\mu)$ with a continuous embedding whose norm is bounded by the $L_2(\Omega,\mu)$-norm of the diagonal:

$$ \|a\|_{L_2(\Omega,\mu)} = \|\omega\mapsto\langle a,G(\cdot,\omega)\rangle_{\mathcal G}\|_{L_2(\Omega,\mu)} \le \|a\|_{\mathcal G}\,\|d\|_{L_2(\Omega,\mu)} \tag{74} $$

for every $a\in\mathcal G$. Now, let $W\subset\Omega$ be a set on which we observe $a\in\mathcal G$, and denote by $P_W$ the corresponding power function, given by

$$ P_W(\omega)^2 = G(\omega,\omega) - G_W(\omega,\omega), \tag{75} $$

where $G_W$ is the kernel of the sub-space $\mathcal G_W$ generated by $\{G(\cdot,w) : w\in W\}$; observe that $P_W\in L_2(\Omega,\mu)$, too.

We are interested in estimating the pointwise error at $x\in X$ made by approximating $a$ by the interpolant $a_W\in\mathcal G_W$ with $a_W(w) = a(w)$ for all $w\in W$, i.e. for $f = S(a)$ and $f_W = S(a_W)$ we estimate

$$ \begin{aligned} |f(x) - f_W(x)| &= |\langle S(a) - S(a_W),K(\cdot,x)\rangle_{\mathcal H}| \\ &\le |\langle a - a_W,\overline{k(x,\cdot)}\rangle_{L_2(\Omega,\mu)}| \\ &\le \|a - a_W\|_{L_2(\Omega,\mu)}\,\|k(x,\cdot)\|_{L_2(\Omega,\mu)} \\ &\le \|a\|_{\mathcal G}\,\|P_W\|_{L_2(\Omega,\mu)}\,\|k(x,\cdot)\|_{L_2(\Omega,\mu)}; \end{aligned} \tag{76} $$

observe that the first inequality is in fact an equality if $\mathcal N(S) = \{0\}$. Recall that $a_W$ can also be characterised as the function in $\mathcal G$ with minimal norm interpolating $a(w)$ at all $w\in W$.

Proposition 6.4 Let $(\Omega,\mu)$ be a measure space, $W\subset\Omega$ a set of sample points, $X$ a set and $\mathcal G\subset L_2(\Omega,\mu)$ a continuously embedded r.k.h.s. Moreover, assume that we are given an integral transform $S$ with kernel $k : X\times\Omega\to\mathbb C$ such that $S(L_2(\Omega,\mu))$ is a r.k.h.s. as in Corollary 4.1. Then for $a\in\mathcal G$ and all $x\in X$, the pointwise difference between the image of the minimum-norm interpolant $a_W$, with corresponding power function $P_W$, and the image of $a$ at $x$ can be bounded by

$$ |S(a)(x) - S(a_W)(x)| \le \|a\|_{\mathcal G}\,\|P_W\|_{L_2(\Omega,\mu)}\,\|k(x,\cdot)\|_{L_2(\Omega,\mu)}. \tag{77} $$

Note that, in order to put this to practical use, one will have to be able to compute $S(G(\cdot,w))$ to obtain the image $f_W$ of the interpolant $a_W$ under $S$; indeed, if $W$ is finite, $a_W$ is given by $\sum_{w\in W}\alpha_w G(\cdot,w)$ for some $\alpha_w\in\mathbb C$.